This is a collection of web page captures from links added to, or changed on, Wikipedia pages. The idea is to bring reliability to Wikipedia outlinks so that if the pages referenced by Wikipedia articles are changed, or go away, a reader can still find what was originally referred to. This is part of the Internet Archive's attempt to rid the web of broken links.
Topics: Wikipedia, Wikimedia
A daily crawl of more than 200,000 home pages of news sites, including the pages linked from those home pages. Site list provided by The GDELT Project.
Topics: GDELT, News
This is a collection of pages and embedded objects from WordPress blogs and the external pages they link to. Captures of these pages are made continuously, seeded from a feed of new or changed pages hosted by WordPress.com or by WordPress sites running a properly configured Jetpack plugin.
Topics: Wordpress.com, blogs, jetpack
Internet Archive crawldata from feed-driven WordPress Crawl, captured by wwwb-crawl08.us.archive.org:no404 from Sat Jun 2 10:26:38 PDT 2018 to Sat Jun 2 13:37:35 PDT 2018.
Topics: no404, wordpress, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Fri May 5 03:05:22 PDT 2017 to Fri May 5 20:22:10 PDT 2017.
Topics: no404, wikipedia, crawldata
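The crawldata descriptions in this listing share one pattern: crawl name, capturing host, a job label after the colon, and a start and end timestamp. As a rough illustration, the sketch below parses one such line in Python. The function names, the regular expression, and the mapping of PDT and PST to fixed UTC offsets are assumptions made for this example, not anything published by the Internet Archive. Several entries in the listing show an end time earlier than the start time, so the sketch reports the ordering instead of assuming it.

```python
import re
from datetime import datetime

# Assumed pattern, derived only from the description lines in this listing.
LINE_RE = re.compile(
    r"Internet Archive crawldata from feed-driven (?P<crawl>.+?) Crawl, "
    r"captured by (?P<host>[^:\s]+):(?P<job>\S+) "
    r"from (?P<start>.+?) to (?P<end>.+?)\.$"
)

# Assumption: only these two zone abbreviations occur, with these offsets.
TZ_OFFSETS = {"PDT": "-0700", "PST": "-0800"}


def parse_stamp(text):
    """Parse a stamp such as 'Sat Jun 2 10:26:38 PDT 2018'."""
    weekday, month, day, clock, zone, year = text.split()
    # An unknown zone abbreviation raises KeyError rather than being guessed.
    normalized = f"{month} {day} {clock} {TZ_OFFSETS[zone]} {year}"
    return datetime.strptime(normalized, "%b %d %H:%M:%S %z %Y")


def parse_description(line):
    """Return the fields of one crawldata description, or None if no match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    start = parse_stamp(match.group("start"))
    end = parse_stamp(match.group("end"))
    return {
        "crawl": match.group("crawl"),
        "host": match.group("host"),
        "job": match.group("job"),
        "start": start,
        "end": end,
        # False when the listed end time precedes the listed start time.
        "ordered": start <= end,
    }
```

Calling `parse_description` on a description line returns a dictionary of its fields, and lines that do not match the pattern, such as the `Topics:` lines, return `None`.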
Jul 1, 2018 | by Internet Archive | web | 3.3M views | 0 favorites | 0 comments
Internet Archive crawldata from feed-driven GDELT Crawl, captured by crawl816.us.archive.org:gdelt from Mon Jun 25 14:47:13 PDT 2018 to Sat Jun 30 23:44:27 PDT 2018.
Topic: crawldata
May 6, 2017 | by Internet Archive | web | 1.7M views | 0 favorites | 0 comments
Internet Archive crawldata from feed-driven GDELT Crawl, captured by crawl816.us.archive.org:gdelt from Sat May 6 16:20:29 PDT 2017 to Sat May 6 10:25:37 PDT 2017.
Topic: crawldata
Jun 10, 2017 | by Internet Archive | web | 1.3M views | 0 favorites | 0 comments
Internet Archive crawldata from feed-driven GDELT Crawl, captured by crawl816.us.archive.org:gdelt from Sat Jun 10 10:04:42 PDT 2017 to Sat Jun 10 04:04:56 PDT 2017.
Topic: crawldata
Jun 10, 2017 | by Internet Archive | web | 1.3M views | 0 favorites | 0 comments
Internet Archive crawldata from feed-driven GDELT Crawl, captured by crawl816.us.archive.org:gdelt from Sat Jun 10 16:37:34 PDT 2017 to Sat Jun 10 11:06:28 PDT 2017.
Topic: crawldata
Jun 11, 2017 | by Internet Archive | web | 1.3M views | 0 favorites | 0 comments
Internet Archive crawldata from feed-driven GDELT Crawl, captured by crawl816.us.archive.org:gdelt from Sun Jun 11 02:32:54 PDT 2017 to Sat Jun 10 20:55:43 PDT 2017.
Topic: crawldata
Jun 10, 2017 | by Internet Archive | web | 1.3M views | 0 favorites | 0 comments
Internet Archive crawldata from feed-driven GDELT Crawl, captured by crawl816.us.archive.org:gdelt from Sat Jun 10 21:26:19 PDT 2017 to Sat Jun 10 16:13:06 PDT 2017.
Topic: crawldata
Jun 10, 2017 | by Internet Archive | web | 1.2M views | 0 favorites | 0 comments
Internet Archive crawldata from feed-driven GDELT Crawl, captured by crawl816.us.archive.org:gdelt from Sat Jun 10 13:08:14 PDT 2017 to Sat Jun 10 06:59:57 PDT 2017.
Topic: crawldata
Jan 25, 2017 | by Internet Archive | web | 956,715 views | 0 favorites | 0 comments
Internet Archive crawldata from feed-driven GDELT Crawl, captured by crawl409.us.archive.org:gdelt from Fri Jan 20 14:31:54 PST 2017 to Fri Jan 20 07:48:07 PST 2017.
Topic: crawldata
Sep 13, 2018 | by Internet Archive | web | 950,263 views | 0 favorites | 0 comments
Internet Archive crawldata from feed-driven GDELT Crawl, captured by crawl409.us.archive.org:gdelt from Thu Sep 13 02:07:17 PDT 2018 to Wed Sep 12 20:49:14 PDT 2018.
Topic: crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl896.us.archive.org:no404 from Thu May 18 02:00:07 PDT 2017 to Thu May 18 01:34:36 PDT 2017.
Topics: no404, wikipedia, crawldata
Sep 7, 2017 | by Internet Archive | web | 864,795 views | 0 favorites | 0 comments
Internet Archive crawldata from feed-driven GDELT Crawl, captured by crawl816.us.archive.org:gdelt from Thu Sep 7 11:27:02 PDT 2017 to Thu Sep 7 06:24:47 PDT 2017.
Topic: crawldata
Jan 28, 2018 | by Internet Archive | web | 807,285 views | 0 favorites | 0 comments
Internet Archive crawldata from feed-driven GDELT Crawl, captured by crawl409.us.archive.org:gdelt from Sun Jan 28 18:51:48 PST 2018 to Sun Jan 28 12:11:14 PST 2018.
Topic: crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl896.us.archive.org:no404 from Sun Jun 17 16:59:56 PDT 2018 to Sun Jun 17 22:30:01 PDT 2018.
Topics: no404, wikipedia, crawldata
Jan 28, 2018 | by Internet Archive | web | 790,072 views | 0 favorites | 0 comments
Internet Archive crawldata from feed-driven GDELT Crawl, captured by crawl409.us.archive.org:gdelt from Sun Jan 28 19:25:21 PST 2018 to Sun Jan 28 14:31:24 PST 2018.
Topic: crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Sat Jun 27 02:35:05 PDT 2015 to Fri Jun 26 21:13:31 PDT 2015.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl110.us.archive.org:no404 from Fri Jun 15 15:53:52 PDT 2018 to Sat Jun 16 01:58:14 PDT 2018.
Topics: no404, wikipedia, crawldata
Jul 16, 2015 | by Internet Archive | web | 669,154 views | 0 favorites | 0 comments
Internet Archive crawldata from feed-driven GDELT Crawl, captured by crawl816.us.archive.org:gdelt from Thu Jul 16 10:27:47 PDT 2015 to Thu Jul 16 04:43:26 PDT 2015.
Topic: crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Wed Oct 30 21:19:56 PDT 2013 to Wed Oct 30 15:58:29 PDT 2013.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Tue Oct 7 09:36:28 PDT 2014 to Tue Oct 7 05:34:58 PDT 2014.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven WordPress Crawl, captured by crawl344.us.archive.org:no404 from Sun Oct 11 05:26:52 PDT 2015 to Sun Oct 11 08:02:18 PDT 2015.
Topics: no404, wordpress, crawldata
Feb 1, 2017 | by Internet Archive | web | 471,802 views | 0 favorites | 0 comments
Internet Archive crawldata from feed-driven GDELT Crawl, captured by crawl816.us.archive.org:gdelt from Wed Feb 1 04:50:38 PST 2017 to Tue Jan 31 21:52:57 PST 2017.
Topic: crawldata
Jun 13, 2018 | by Internet Archive | web | 392,610 views | 0 favorites | 0 comments
Internet Archive crawldata from feed-driven GDELT Crawl, captured by crawl409.us.archive.org:gdelt from Wed Jun 13 07:06:46 PDT 2018 to Wed Jun 13 01:14:17 PDT 2018.
Topic: crawldata
Internet Archive crawldata from feed-driven WordPress Crawl, captured by crawl458.us.archive.org:no404 from Wed Sep 11 01:48:56 PDT 2013 to Tue Sep 10 19:39:40 PDT 2013.
Topics: no404, wordpress, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Wed Feb 18 05:03:48 PST 2015 to Tue Feb 17 22:30:10 PST 2015.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl896.us.archive.org:no404 from Wed Jul 12 03:57:55 PDT 2017 to Tue Jul 11 22:18:15 PDT 2017.
Topics: no404, wikipedia, crawldata
Mar 17, 2018 | by Internet Archive | web | 356,605 views | 0 favorites | 0 comments
Internet Archive crawldata from feed-driven GDELT Crawl, captured by crawl816.us.archive.org:gdelt from Sat Mar 17 09:27:03 PDT 2018 to Sat Mar 17 06:00:27 PDT 2018.
Topic: crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Mon Nov 4 01:08:35 PST 2013 to Sun Nov 3 18:31:38 PST 2013.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Fri Oct 11 18:57:27 PDT 2013 to Fri Oct 11 18:27:42 PDT 2013.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl896.us.archive.org:no404 from Tue Jun 6 09:58:02 PDT 2017 to Tue Jun 6 05:29:32 PDT 2017.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven WordPress Crawl, captured by crawl344.us.archive.org:no404 from Mon Dec 9 02:29:07 PST 2013 to Sun Dec 8 19:51:18 PST 2013.
Topics: no404, wordpress, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Sat Oct 12 02:06:46 PDT 2013 to Fri Oct 11 20:57:12 PDT 2013.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Sun Jan 11 20:40:00 PST 2015 to Sun Jan 11 17:30:17 PST 2015.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Sat Jan 10 20:28:55 PST 2015 to Sat Jan 10 14:35:49 PST 2015.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Sat Oct 12 05:10:05 PDT 2013 to Fri Oct 11 23:33:01 PDT 2013.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Sat Oct 12 01:17:32 PDT 2013 to Fri Oct 11 19:35:18 PDT 2013.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Sat Oct 12 03:09:48 PDT 2013 to Fri Oct 11 21:36:24 PDT 2013.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven WordPress Crawl, captured by crawl106.us.archive.org:no404 from Wed Oct 31 22:29:30 PDT 2018 to Thu Nov 1 03:23:08 PDT 2018.
Topics: no404, wordpress, crawldata
Internet Archive crawldata from feed-driven WordPress Crawl, captured by crawl108.us.archive.org:no404 from Thu Nov 1 00:49:53 PDT 2018 to Thu Nov 1 04:06:58 PDT 2018.
Topics: no404, wordpress, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Sat Oct 12 04:03:47 PDT 2013 to Fri Oct 11 22:24:49 PDT 2013.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven WordPress Crawl, captured by crawl106.us.archive.org:no404 from Thu Nov 1 06:43:55 PDT 2018 to Thu Nov 1 09:20:06 PDT 2018.
Topics: no404, wordpress, crawldata
Internet Archive crawldata from feed-driven WordPress Crawl, captured by crawl108.us.archive.org:no404 from Thu Nov 1 08:13:40 PDT 2018 to Thu Nov 1 10:12:18 PDT 2018.
Topics: no404, wordpress, crawldata
Internet Archive crawldata from feed-driven WordPress Crawl, captured by crawl107.us.archive.org:no404 from Thu Nov 1 02:25:04 PDT 2018 to Thu Nov 1 05:03:57 PDT 2018.
Topics: no404, wordpress, crawldata
Jan 29, 2018 | by Internet Archive | web | 275,308 views | 0 favorites | 0 comments
Internet Archive crawldata from feed-driven GDELT Crawl, captured by crawl816.us.archive.org:gdelt from Mon Jan 29 00:17:27 PST 2018 to Sun Jan 28 17:44:33 PST 2018.
Topic: crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Wed Jan 18 03:26:20 PST 2017 to Tue Jan 17 21:10:10 PST 2017.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven WordPress Crawl, captured by crawl344.us.archive.org:no404 from Fri Nov 8 18:07:43 PST 2013 to Fri Nov 8 11:24:54 PST 2013.
Topics: no404, wordpress, crawldata
Internet Archive crawldata from feed-driven WordPress Crawl, captured by crawl458.us.archive.org:no404 from Wed Oct 9 13:36:49 PDT 2013 to Wed Oct 9 07:59:25 PDT 2013.
Topics: no404, wordpress, crawldata
Internet Archive crawldata from feed-driven WordPress Crawl, captured by crawl818.us.archive.org:no404 from Tue Jan 10 14:41:43 PST 2017 to Tue Jan 10 10:59:29 PST 2017.
Topics: no404, wordpress, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Sun Sep 22 02:43:39 PDT 2013 to Sat Sep 21 21:49:05 PDT 2013.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven WordPress Crawl, captured by crawl458.us.archive.org:no404 from Mon Oct 7 06:39:20 PDT 2013 to Mon Oct 7 01:07:00 PDT 2013.
Topics: no404, wordpress, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Sat Sep 21 22:25:59 PDT 2013 to Sat Sep 21 18:13:45 PDT 2013.
Topics: no404, wikipedia, crawldata
Jun 5, 2016 | by Internet Archive | web | 254,529 views | 0 favorites | 0 comments
Internet Archive crawldata from feed-driven GDELT Crawl, captured by crawl816.us.archive.org:gdelt from Sun Jun 5 12:10:10 PDT 2016 to Sun Jun 5 06:42:33 PDT 2016.
Topic: crawldata
Internet Archive crawldata from feed-driven WordPress Crawl, captured by crawl344.us.archive.org:no404 from Mon Dec 2 21:56:25 PST 2013 to Mon Dec 2 15:29:07 PST 2013.
Topics: no404, wordpress, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Sat Oct 12 07:38:37 PDT 2013 to Sat Oct 12 02:15:16 PDT 2013.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Sat Oct 12 06:01:08 PDT 2013 to Sat Oct 12 00:24:12 PDT 2013.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl109.us.archive.org:no404 from Mon Sep 24 07:08:06 PDT 2018 to Mon Sep 24 22:54:26 PDT 2018.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven WordPress Crawl, captured by crawl458.us.archive.org:no404 from Fri Nov 8 19:29:44 PST 2013 to Fri Nov 8 12:30:08 PST 2013.
Topics: no404, wordpress, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Sat Sep 21 23:53:08 PDT 2013 to Sat Sep 21 19:42:29 PDT 2013.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Sun Oct 26 06:02:13 PDT 2014 to Sun Oct 26 00:44:36 PDT 2014.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Thu Mar 13 15:39:54 PDT 2014 to Thu Mar 13 10:29:53 PDT 2014.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Sun Oct 13 22:05:29 PDT 2013 to Sun Oct 13 16:25:33 PDT 2013.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Tue Dec 3 01:48:04 PST 2013 to Mon Dec 2 20:11:08 PST 2013.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven WordPress Crawl, captured by crawl458.us.archive.org:no404 from Fri Nov 8 18:12:47 PST 2013 to Fri Nov 8 11:19:28 PST 2013.
Topics: no404, wordpress, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Mon Apr 13 23:31:41 PDT 2015 to Mon Apr 13 18:01:32 PDT 2015.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Sun Dec 15 11:40:44 PST 2013 to Sun Dec 15 05:32:44 PST 2013.
Topics: no404, wikipedia, crawldata
Oct 1, 2015 | by Internet Archive | web | 227,588 views | 0 favorites | 0 comments
Internet Archive crawldata from feed-driven GDELT Crawl, captured by crawl816.us.archive.org:gdelt from Thu Oct 1 15:26:49 PDT 2015 to Thu Oct 1 09:43:18 PDT 2015.
Topic: crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Sat Oct 12 11:08:48 PDT 2013 to Sat Oct 12 06:01:41 PDT 2013.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Sat Nov 2 18:33:00 PDT 2013 to Sat Nov 2 13:05:27 PDT 2013.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Sat Sep 21 06:02:57 PDT 2013 to Sat Sep 21 01:17:57 PDT 2013.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Sat Sep 21 12:19:50 PDT 2013 to Sat Sep 21 07:00:20 PDT 2013.
Topics: no404, wikipedia, crawldata
Internet Archive crawldata from feed-driven Wikipedia Outlinks Crawl, captured by crawl345.us.archive.org:no404 from Sat Oct 12 10:13:29 PDT 2013 to Sat Oct 12 04:53:58 PDT 2013.
Topics: no404, wikipedia, crawldata